Dataset statistics
| Number of variables | 16 |
|---|---|
| Number of observations | 973797 |
| Missing cells | 169356 |
| Missing cells (%) | 1.1% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 118.9 MiB |
| Average record size in memory | 128.0 B |
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 4 |
Alerts
| time has a high cardinality: 40495 distinct values | High cardinality |
|---|---|
| gameId is highly correlated with team | High correlation |
| frameId is highly correlated with s and 1 other fields | High correlation |
| s is highly correlated with dis | High correlation |
| dis is highly correlated with s | High correlation |
| team is highly correlated with gameId | High correlation |
| nflId has 42339 (4.3%) missing values | Missing |
| jerseyNumber has 42339 (4.3%) missing values | Missing |
| o has 42339 (4.3%) missing values | Missing |
| dir has 42339 (4.3%) missing values | Missing |
| s has 62663 (6.4%) zeros | Zeros |
| a has 58410 (6.0%) zeros | Zeros |
| dis has 61876 (6.4%) zeros | Zeros |
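The dataset-level missing-cell figures can be cross-checked against the per-column alerts. The sketch below is only a sanity check using the numbers reported in this document; the variable names are illustrative and not part of pandas-profiling.

```python
# Figures copied from the "Dataset statistics" table and the "Missing" alerts.
n_rows, n_cols = 973797, 16
missing_per_column = {
    "nflId": 42339,
    "jerseyNumber": 42339,
    "o": 42339,
    "dir": 42339,
}

# Total missing cells is the sum over the affected columns.
total_missing = sum(missing_per_column.values())

# Share of all cells (rows x columns) that are missing.
pct_missing_cells = 100 * total_missing / (n_rows * n_cols)

# Share of rows missing within a single affected column.
pct_missing_column = 100 * missing_per_column["nflId"] / n_rows

print(total_missing, round(pct_missing_cells, 1), round(pct_missing_column, 1))
```

The sum reproduces the reported 169356 missing cells, and both percentages round to the reported 1.1% and 4.3%.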
Reproduction
| Analysis started | 2022-11-02 15:00:23.316286 |
|---|---|
| Analysis finished | 2022-11-02 15:01:51.668211 |
| Duration | 1 minute and 28.35 seconds |
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
gameId
Real number (ℝ≥0)
| Distinct | 14 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2021101693 |
| Minimum | 2021101400 |
|---|---|
| Maximum | 2021101800 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 2021101400 |
|---|---|
| 5-th percentile | 2021101400 |
| Q1 | 2021101702 |
| median | 2021101706 |
| Q3 | 2021101709 |
| 95-th percentile | 2021101800 |
| Maximum | 2021101800 |
| Range | 400 |
| Interquartile range (IQR) | 7 |
Descriptive statistics
| Standard deviation | 80.73039883 |
|---|---|
| Coefficient of variation (CV) | 3.994375895 × 10⁻⁸ |
| Kurtosis | 8.362886128 |
| Mean | 2021101693 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -2.887787025 |
| Sum | 1.968142765 × 10¹⁵ |
| Variance | 6517.397296 |
| Monotonicity | Increasing |
| Value | Count | Frequency (%) |
| 2021101707 | 82570 | 8.5% |
| 2021101702 | 81259 | 8.3% |
| 2021101700 | 80661 | 8.3% |
| 2021101709 | 80546 | 8.3% |
| 2021101704 | 75969 | 7.8% |
| 2021101800 | 73278 | 7.5% |
| 2021101706 | 71415 | 7.3% |
| 2021101710 | 69598 | 7.1% |
| 2021101708 | 63779 | 6.5% |
| 2021101701 | 63181 | 6.5% |
| Other values (4) | 231541 |
| Value | Count | Frequency (%) |
| 2021101400 | 62537 | |
| 2021101700 | 80661 | |
| 2021101701 | 63181 | |
| 2021101702 | 81259 | |
| 2021101703 | 60030 | |
| 2021101704 | 75969 | |
| 2021101705 | 52739 | |
| 2021101706 | 71415 | |
| 2021101707 | 82570 | |
| 2021101708 | 63779 |
| Value | Count | Frequency (%) |
| 2021101800 | 73278 | |
| 2021101711 | 56235 | |
| 2021101710 | 69598 | |
| 2021101709 | 80546 | |
| 2021101708 | 63779 | |
| 2021101707 | 82570 | |
| 2021101706 | 71415 | |
| 2021101705 | 52739 | |
| 2021101704 | 75969 | |
| 2021101703 | 60030 |
playId
Real number (ℝ≥0)
| Distinct | 903 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2172.958596 |
| Minimum | 55 |
|---|---|
| Maximum | 5223 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 55 |
|---|---|
| 5-th percentile | 257 |
| Q1 | 1054 |
| median | 2145 |
| Q3 | 3245 |
| 95-th percentile | 4158 |
| Maximum | 5223 |
| Range | 5168 |
| Interquartile range (IQR) | 2191 |
Descriptive statistics
| Standard deviation | 1269.613603 |
|---|---|
| Coefficient of variation (CV) | 0.5842787826 |
| Kurtosis | -1.108311386 |
| Mean | 2172.958596 |
| Median Absolute Deviation (MAD) | 1099 |
| Skewness | 0.09770258967 |
| Sum | 2116020562 |
| Variance | 1611918.701 |
| Monotonicity | Not monotonic |
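Several entries in the quantile and descriptive tables are derived from the others. As a minimal sketch, assuming the usual definitions (range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, variance = standard deviation squared), the playId figures above can be reproduced from the basic statistics:

```python
# Values copied from the playId tables above.
minimum, maximum = 55, 5223
q1, q3 = 1054, 3245
mean, std = 2172.958596, 1269.613603

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # coefficient of variation (unitless)
variance = std ** 2               # square of the standard deviation

print(value_range, iqr, cv, variance)
```

Because the inputs are rounded as printed in the report, the recomputed CV and variance agree with the reported 0.5842787826 and 1611918.701 only up to rounding error.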
| Value | Count | Frequency (%) |
| 923 | 4186 | 0.4% |
| 3837 | 2806 | 0.3% |
| 1063 | 2806 | 0.3% |
| 2739 | 2760 | 0.3% |
| 1757 | 2714 | 0.3% |
| 3906 | 2691 | 0.3% |
| 1778 | 2645 | 0.3% |
| 55 | 2599 | 0.3% |
| 3130 | 2576 | 0.3% |
| 1427 | 2553 | 0.3% |
| Other values (893) | 945461 |
| Value | Count | Frequency (%) |
| 55 | 2599 | |
| 56 | 667 | 0.1% |
| 62 | 989 | 0.1% |
| 63 | 897 | 0.1% |
| 73 | 2185 | |
| 76 | 667 | 0.1% |
| 83 | 874 | 0.1% |
| 94 | 690 | 0.1% |
| 95 | 713 | 0.1% |
| 96 | 1173 |
| Value | Count | Frequency (%) |
| 5223 | 989 | |
| 5133 | 1196 | |
| 5087 | 1196 | |
| 5036 | 782 | |
| 4883 | 782 | |
| 4801 | 989 | |
| 4775 | 1219 | |
| 4746 | 598 | |
| 4692 | 874 | |
| 4663 | 713 |
nflId
Real number (ℝ≥0)
| Distinct | 1025 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 42339 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45822.98825 |
| Minimum | 25511 |
|---|---|
| Maximum | 53957 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 25511 |
|---|---|
| 5-th percentile | 37724 |
| Q1 | 42476 |
| median | 46070 |
| Q3 | 48034 |
| 95-th percentile | 53496 |
| Maximum | 53957 |
| Range | 28446 |
| Interquartile range (IQR) | 5558 |
Descriptive statistics
| Standard deviation | 4982.562942 |
|---|---|
| Coefficient of variation (CV) | 0.1087350069 |
| Kurtosis | -0.01482357479 |
| Mean | 45822.98825 |
| Median Absolute Deviation (MAD) | 3239 |
| Skewness | -0.1889764522 |
| Sum | 4.268218899 × 10¹⁰ |
| Variance | 24825933.47 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 38592 | 2427 | 0.2% |
| 47810 | 2427 | 0.2% |
| 38642 | 2427 | 0.2% |
| 53472 | 2427 | 0.2% |
| 52491 | 2427 | 0.2% |
| 47824 | 2427 | 0.2% |
| 43384 | 2427 | 0.2% |
| 41258 | 2427 | 0.2% |
| 46109 | 2216 | 0.2% |
| 52938 | 2181 | 0.2% |
| Other values (1015) | 907645 | |
| (Missing) | 42339 | 4.3% |
| Value | Count | Frequency (%) |
| 25511 | 1162 | |
| 28963 | 1306 | |
| 29550 | 1541 | |
| 29851 | 1069 | |
| 30842 | 540 | 0.1% |
| 30869 | 1014 | |
| 33107 | 1139 | |
| 33130 | 429 | < 0.1% |
| 33131 | 1194 | |
| 34452 | 1130 |
| Value | Count | Frequency (%) |
| 53957 | 1278 | |
| 53953 | 1497 | |
| 53946 | 324 | < 0.1% |
| 53921 | 666 | |
| 53910 | 35 | < 0.1% |
| 53900 | 594 | 0.1% |
| 53876 | 81 | < 0.1% |
| 53854 | 511 | 0.1% |
| 53687 | 481 | < 0.1% |
| 53678 | 31 | < 0.1% |
frameId
Real number (ℝ≥0)
| Distinct | 182 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 23.37246983 |
| Minimum | 1 |
|---|---|
| Maximum | 182 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 11 |
| median | 22 |
| Q3 | 33 |
| 95-th percentile | 51 |
| Maximum | 182 |
| Range | 181 |
| Interquartile range (IQR) | 22 |
Descriptive statistics
| Standard deviation | 16.11098948 |
|---|---|
| Coefficient of variation (CV) | 0.689314805 |
| Kurtosis | 7.11016077 |
| Mean | 23.37246983 |
| Median Absolute Deviation (MAD) | 11 |
| Skewness | 1.526384812 |
| Sum | 22760041 |
| Variance | 259.5639821 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 23092 | 2.4% |
| 2 | 23092 | 2.4% |
| 20 | 23092 | 2.4% |
| 19 | 23092 | 2.4% |
| 18 | 23092 | 2.4% |
| 17 | 23092 | 2.4% |
| 16 | 23092 | 2.4% |
| 15 | 23092 | 2.4% |
| 14 | 23092 | 2.4% |
| 13 | 23092 | 2.4% |
| Other values (172) | 742877 |
| Value | Count | Frequency (%) |
| 1 | 23092 | |
| 2 | 23092 | |
| 3 | 23092 | |
| 4 | 23092 | |
| 5 | 23092 | |
| 6 | 23092 | |
| 7 | 23092 | |
| 8 | 23092 | |
| 9 | 23092 | |
| 10 | 23092 |
| Value | Count | Frequency (%) |
| 182 | 23 | |
| 181 | 23 | |
| 180 | 23 | |
| 179 | 23 | |
| 178 | 23 | |
| 177 | 23 | |
| 176 | 23 | |
| 175 | 23 | |
| 174 | 23 | |
| 173 | 23 |
time
Categorical
| Distinct | 40495 |
|---|---|
| Distinct (%) | 4.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.4 MiB |
| 2021-10-17T17:25:37.800 | 69 |
|---|---|
| 2021-10-17T17:25:39.100 | 69 |
| 2021-10-17T19:44:31.900 | 69 |
| 2021-10-17T19:44:32.000 | 69 |
| 2021-10-17T19:44:32.200 | 69 |
| Other values (40490) |
Length
| Max length | 23 |
|---|---|
| Median length | 23 |
| Mean length | 23 |
| Min length | 23 |
Characters and Unicode
| Total characters | 22397331 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2021-10-15T00:23:39.200 |
|---|---|
| 2nd row | 2021-10-15T00:23:39.300 |
| 3rd row | 2021-10-15T00:23:39.400 |
| 4th row | 2021-10-15T00:23:39.500 |
| 5th row | 2021-10-15T00:23:39.600 |
Common Values
| Value | Count | Frequency (%) |
| 2021-10-17T17:25:37.800 | 69 | < 0.1% |
| 2021-10-17T17:25:39.100 | 69 | < 0.1% |
| 2021-10-17T19:44:31.900 | 69 | < 0.1% |
| 2021-10-17T19:44:32.000 | 69 | < 0.1% |
| 2021-10-17T19:44:32.200 | 69 | < 0.1% |
| 2021-10-17T17:25:38.500 | 69 | < 0.1% |
| 2021-10-17T17:25:38.600 | 69 | < 0.1% |
| 2021-10-17T17:25:38.700 | 69 | < 0.1% |
| 2021-10-17T17:25:38.800 | 69 | < 0.1% |
| 2021-10-17T17:25:38.900 | 69 | < 0.1% |
| Other values (40485) | 973107 |
Length
| Value | Count | Frequency (%) |
| 2021-10-17t17:25:37.800 | 69 | < 0.1% |
| 2021-10-17t18:11:25.200 | 69 | < 0.1% |
| 2021-10-17t18:11:25.000 | 69 | < 0.1% |
| 2021-10-17t19:44:31.800 | 69 | < 0.1% |
| 2021-10-17t19:44:32.100 | 69 | < 0.1% |
| 2021-10-17t19:44:31.600 | 69 | < 0.1% |
| 2021-10-17t17:25:38.100 | 69 | < 0.1% |
| 2021-10-17t19:44:31.500 | 69 | < 0.1% |
| 2021-10-17t17:25:37.700 | 69 | < 0.1% |
| 2021-10-17t17:25:37.900 | 69 | < 0.1% |
| Other values (40485) | 973107 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4833703 | |
| 1 | 4196856 | |
| 2 | 2957432 | |
| - | 1947594 | |
| : | 1947594 | |
| 7 | 1229051 | 5.5% |
| T | 973797 | 4.3% |
| . | 973797 | 4.3% |
| 3 | 690161 | 3.1% |
| 5 | 674383 | 3.0% |
| Other values (4) | 1972963 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 16554549 | |
| Other Punctuation | 2921391 | 13.0% |
| Dash Punctuation | 1947594 | 8.7% |
| Uppercase Letter | 973797 | 4.3% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 4833703 | |
| 1 | 4196856 | |
| 2 | 2957432 | |
| 7 | 1229051 | 7.4% |
| 3 | 690161 | 4.2% |
| 5 | 674383 | 4.1% |
| 4 | 654166 | 4.0% |
| 9 | 546365 | 3.3% |
| 8 | 465520 | 2.8% |
| 6 | 306912 | 1.9% |
Other Punctuation
| Value | Count | Frequency (%) |
| : | 1947594 | |
| . | 973797 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 1947594 |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 973797 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 21423534 | |
| Latin | 973797 | 4.3% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 4833703 | |
| 1 | 4196856 | |
| 2 | 2957432 | |
| - | 1947594 | |
| : | 1947594 | |
| 7 | 1229051 | 5.7% |
| . | 973797 | 4.5% |
| 3 | 690161 | 3.2% |
| 5 | 674383 | 3.1% |
| 4 | 654166 | 3.1% |
| Other values (3) | 1318797 | 6.2% |
Latin
| Value | Count | Frequency (%) |
| T | 973797 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 22397331 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 4833703 | |
| 1 | 4196856 | |
| 2 | 2957432 | |
| - | 1947594 | |
| : | 1947594 | |
| 7 | 1229051 | 5.5% |
| T | 973797 | 4.3% |
| . | 973797 | 4.3% |
| 3 | 690161 | 3.1% |
| 5 | 674383 | 3.0% |
| Other values (4) | 1972963 |
jerseyNumber
Real number (ℝ≥0)
| Distinct | 98 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 42339 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 49.42006296 |
| Minimum | 1 |
|---|---|
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 23 |
| median | 52 |
| Q3 | 75 |
| 95-th percentile | 96 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 52 |
Descriptive statistics
| Standard deviation | 29.91900492 |
|---|---|
| Coefficient of variation (CV) | 0.6054019993 |
| Kurtosis | -1.332975934 |
| Mean | 49.42006296 |
| Median Absolute Deviation (MAD) | 27 |
| Skewness | 0.05194198004 |
| Sum | 46032713 |
| Variance | 895.1468555 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 21 | 22218 | 2.3% |
| 23 | 16706 | 1.7% |
| 24 | 16302 | 1.7% |
| 11 | 16233 | 1.7% |
| 97 | 16142 | 1.7% |
| 72 | 15692 | 1.6% |
| 2 | 15423 | 1.6% |
| 55 | 14710 | 1.5% |
| 31 | 14278 | 1.5% |
| 74 | 14258 | 1.5% |
| Other values (88) | 769496 | |
| (Missing) | 42339 | 4.3% |
| Value | Count | Frequency (%) |
| 1 | 13212 | |
| 2 | 15423 | |
| 3 | 5290 | 0.5% |
| 4 | 10200 | |
| 5 | 6716 | |
| 6 | 6470 | |
| 7 | 7590 | |
| 8 | 9144 | |
| 9 | 10500 | |
| 10 | 10350 |
| Value | Count | Frequency (%) |
| 99 | 13961 | |
| 98 | 13377 | |
| 97 | 16142 | |
| 96 | 9190 | |
| 95 | 6548 | |
| 94 | 11480 | |
| 93 | 9883 | |
| 92 | 6914 | |
| 91 | 13981 | |
| 90 | 14076 |
team
Categorical
| Distinct | 29 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.4 MiB |
| football | 42339 |
|---|---|
| WAS | 39490 |
| KC | 39490 |
| MIN | 38863 |
| CAR | 38863 |
| Other values (24) |
Length
| Max length | 8 |
|---|---|
| Median length | 3 |
| Mean length | 3.00782812 |
| Min length | 2 |
Characters and Unicode
| Total characters | 2929014 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | TB |
|---|---|
| 2nd row | TB |
| 3rd row | TB |
| 4th row | TB |
| 5th row | TB |
Common Values
| Value | Count | Frequency (%) |
| football | 42339 | 4.3% |
| WAS | 39490 | 4.1% |
| KC | 39490 | 4.1% |
| MIN | 38863 | 4.0% |
| CAR | 38863 | 4.0% |
| JAX | 38577 | 4.0% |
| MIA | 38577 | 4.0% |
| DEN | 38522 | 4.0% |
| LV | 38522 | 4.0% |
| DET | 36333 | 3.7% |
| Other values (19) | 584221 |
Length
| Value | Count | Frequency (%) |
| football | 42339 | 4.3% |
| was | 39490 | 4.1% |
| kc | 39490 | 4.1% |
| min | 38863 | 4.0% |
| car | 38863 | 4.0% |
| jax | 38577 | 4.0% |
| mia | 38577 | 4.0% |
| den | 38522 | 4.0% |
| lv | 38522 | 4.0% |
| det | 36333 | 3.7% |
| Other values (19) | 584221 |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 340780 | 11.6% |
| I | 255013 | 8.7% |
| N | 241428 | 8.2% |
| C | 204116 | 7.0% |
| E | 200585 | 6.8% |
| L | 196900 | 6.7% |
| D | 133364 | 4.6% |
| T | 128183 | 4.4% |
| B | 123882 | 4.2% |
| l | 84678 | 2.9% |
| Other values (20) | 1020085 |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 2590302 | |
| Lowercase Letter | 338712 | 11.6% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 340780 | |
| I | 255013 | 9.8% |
| N | 241428 | 9.3% |
| C | 204116 | 7.9% |
| E | 200585 | 7.7% |
| L | 196900 | 7.6% |
| D | 133364 | 5.1% |
| T | 128183 | 4.9% |
| B | 123882 | 4.8% |
| H | 83842 | 3.2% |
| Other values (14) | 682209 |
Lowercase Letter
| Value | Count | Frequency (%) |
| l | 84678 | |
| o | 84678 | |
| a | 42339 | |
| b | 42339 | |
| t | 42339 | |
| f | 42339 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 2929014 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 340780 | 11.6% |
| I | 255013 | 8.7% |
| N | 241428 | 8.2% |
| C | 204116 | 7.0% |
| E | 200585 | 6.8% |
| L | 196900 | 6.7% |
| D | 133364 | 4.6% |
| T | 128183 | 4.4% |
| B | 123882 | 4.2% |
| l | 84678 | 2.9% |
| Other values (20) | 1020085 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2929014 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 340780 | 11.6% |
| I | 255013 | 8.7% |
| N | 241428 | 8.2% |
| C | 204116 | 7.0% |
| E | 200585 | 6.8% |
| L | 196900 | 6.7% |
| D | 133364 | 4.6% |
| T | 128183 | 4.4% |
| B | 123882 | 4.2% |
| l | 84678 | 2.9% |
| Other values (20) | 1020085 |
playDirection
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.4 MiB |
Length
| Max length | 5 |
|---|---|
| Median length | 4 |
| Mean length | 4.478896526 |
| Min length | 4 |
Characters and Unicode
| Total characters | 4361536 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | right |
|---|---|
| 2nd row | right |
| 3rd row | right |
| 4th row | right |
| 5th row | right |
Common Values
| Value | Count | Frequency (%) |
| left | 507449 | |
| right | 466348 |
Length
Category Frequency Plot
| Value | Count | Frequency (%) |
| left | 507449 | |
| right | 466348 |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 973797 | |
| l | 507449 | |
| e | 507449 | |
| f | 507449 | |
| r | 466348 | |
| i | 466348 | |
| g | 466348 | |
| h | 466348 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4361536 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| t | 973797 | |
| l | 507449 | |
| e | 507449 | |
| f | 507449 | |
| r | 466348 | |
| i | 466348 | |
| g | 466348 | |
| h | 466348 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4361536 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| t | 973797 | |
| l | 507449 | |
| e | 507449 | |
| f | 507449 | |
| r | 466348 | |
| i | 466348 | |
| g | 466348 | |
| h | 466348 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4361536 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| t | 973797 | |
| l | 507449 | |
| e | 507449 | |
| f | 507449 | |
| r | 466348 | |
| i | 466348 | |
| g | 466348 | |
| h | 466348 |
x
Real number (ℝ≥0)
| Distinct | 11706 |
|---|---|
| Distinct (%) | 1.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 59.44660441 |
| Minimum | 0.94 |
|---|---|
| Maximum | 121.15 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 0.94 |
|---|---|
| 5-th percentile | 22.04 |
| Q1 | 40.17 |
| median | 58.8 |
| Q3 | 78.67 |
| 95-th percentile | 98.82 |
| Maximum | 121.15 |
| Range | 120.21 |
| Interquartile range (IQR) | 38.5 |
Descriptive statistics
| Standard deviation | 23.94899835 |
|---|---|
| Coefficient of variation (CV) | 0.4028657076 |
| Kurtosis | -0.8539693484 |
| Mean | 59.44660441 |
| Median Absolute Deviation (MAD) | 19.22 |
| Skewness | 0.07121546106 |
| Sum | 57888925.03 |
| Variance | 573.5545218 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 33.76 | 197 | < 0.1% |
| 56.45 | 195 | < 0.1% |
| 61.05 | 187 | < 0.1% |
| 39.33 | 186 | < 0.1% |
| 35.72 | 184 | < 0.1% |
| 33.45 | 184 | < 0.1% |
| 33.58 | 180 | < 0.1% |
| 40.84 | 180 | < 0.1% |
| 40.59 | 179 | < 0.1% |
| 33.98 | 179 | < 0.1% |
| Other values (11696) | 971946 |
| Value | Count | Frequency (%) |
| 0.94 | 1 | |
| 1 | 2 | |
| 1.01 | 1 | |
| 1.03 | 1 | |
| 1.04 | 1 | |
| 1.07 | 2 | |
| 1.1 | 1 | |
| 1.12 | 1 | |
| 1.17 | 1 | |
| 1.18 | 2 |
| Value | Count | Frequency (%) |
| 121.15 | 2 | |
| 121.14 | 2 | |
| 121.11 | 2 | |
| 121.08 | 1 | |
| 121.03 | 1 | |
| 120.98 | 1 | |
| 120.92 | 1 | |
| 120.85 | 1 | |
| 120.77 | 1 | |
| 120.68 | 1 |
y
Real number (ℝ)
| Distinct | 5399 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.70109764 |
| Minimum | -2.55 |
|---|---|
| Maximum | 55.94 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 42 |
| Negative (%) | < 0.1% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | -2.55 |
|---|---|
| 5-th percentile | 11.43 |
| Q1 | 21.83 |
| median | 26.7 |
| Q3 | 31.55 |
| 95-th percentile | 42.02 |
| Maximum | 55.94 |
| Range | 58.49 |
| Interquartile range (IQR) | 9.72 |
Descriptive statistics
| Standard deviation | 8.363680499 |
|---|---|
| Coefficient of variation (CV) | 0.3132335836 |
| Kurtosis | 0.2775235246 |
| Mean | 26.70109764 |
| Median Absolute Deviation (MAD) | 4.86 |
| Skewness | 0.004224886474 |
| Sum | 26001448.78 |
| Variance | 69.95115149 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 23.8 | 981 | 0.1% |
| 23.83 | 979 | 0.1% |
| 23.86 | 972 | 0.1% |
| 23.89 | 967 | 0.1% |
| 23.81 | 965 | 0.1% |
| 23.82 | 930 | 0.1% |
| 23.9 | 928 | 0.1% |
| 23.79 | 915 | 0.1% |
| 23.73 | 909 | 0.1% |
| 23.78 | 909 | 0.1% |
| Other values (5389) | 964342 |
| Value | Count | Frequency (%) |
| -2.55 | 1 | |
| -2.54 | 1 | |
| -2.53 | 1 | |
| -2.5 | 1 | |
| -2.49 | 1 | |
| -2.43 | 1 | |
| -2.41 | 1 | |
| -2.32 | 1 | |
| -2.3 | 1 | |
| -2.18 | 1 |
| Value | Count | Frequency (%) |
| 55.94 | 1 | |
| 55.31 | 1 | |
| 55.03 | 1 | |
| 54.94 | 2 | |
| 54.89 | 1 | |
| 54.88 | 1 | |
| 54.82 | 1 | |
| 54.8 | 1 | |
| 54.78 | 1 | |
| 54.68 | 2 |
s
Real number (ℝ≥0)
| Distinct | 2139 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.616375723 |
| Minimum | 0 |
|---|---|
| Maximum | 28.78 |
| Zeros | 62663 |
| Zeros (%) | 6.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.78 |
| median | 2.17 |
| Q3 | 3.86 |
| 95-th percentile | 6.81 |
| Maximum | 28.78 |
| Range | 28.78 |
| Interquartile range (IQR) | 3.08 |
Descriptive statistics
| Standard deviation | 2.403339724 |
|---|---|
| Coefficient of variation (CV) | 0.9185759152 |
| Kurtosis | 14.23207695 |
| Mean | 2.616375723 |
| Median Absolute Deviation (MAD) | 1.51 |
| Skewness | 2.336976597 |
| Sum | 2547818.83 |
| Variance | 5.77604183 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 62663 | 6.4% |
| 0.01 | 15252 | 1.6% |
| 0.02 | 8659 | 0.9% |
| 0.03 | 6204 | 0.6% |
| 0.04 | 5130 | 0.5% |
| 0.05 | 4374 | 0.4% |
| 0.06 | 3838 | 0.4% |
| 0.07 | 3526 | 0.4% |
| 0.08 | 3261 | 0.3% |
| 0.09 | 3208 | 0.3% |
| Other values (2129) | 857682 |
| Value | Count | Frequency (%) |
| 0 | 62663 | |
| 0.01 | 15252 | 1.6% |
| 0.02 | 8659 | 0.9% |
| 0.03 | 6204 | 0.6% |
| 0.04 | 5130 | 0.5% |
| 0.05 | 4374 | 0.4% |
| 0.06 | 3838 | 0.4% |
| 0.07 | 3526 | 0.4% |
| 0.08 | 3261 | 0.3% |
| 0.09 | 3208 | 0.3% |
| Value | Count | Frequency (%) |
| 28.78 | 1 | |
| 28.61 | 1 | |
| 28.38 | 1 | |
| 28.04 | 1 | |
| 27.87 | 1 | |
| 27.74 | 1 | |
| 27.43 | 1 | |
| 27.24 | 1 | |
| 27.13 | 1 | |
| 27.04 | 1 |
a
Real number (ℝ≥0)
| Distinct | 1549 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.80187497 |
| Minimum | 0 |
|---|---|
| Maximum | 28.4 |
| Zeros | 58410 |
| Zeros (%) | 6.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.73 |
| median | 1.55 |
| Q3 | 2.59 |
| 95-th percentile | 4.48 |
| Maximum | 28.4 |
| Range | 28.4 |
| Interquartile range (IQR) | 1.86 |
Descriptive statistics
| Standard deviation | 1.446293637 |
|---|---|
| Coefficient of variation (CV) | 0.8026603738 |
| Kurtosis | 7.029842627 |
| Mean | 1.80187497 |
| Median Absolute Deviation (MAD) | 0.91 |
| Skewness | 1.491214838 |
| Sum | 1754660.44 |
| Variance | 2.091765284 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 58410 | 6.0% |
| 0.01 | 11797 | 1.2% |
| 0.02 | 6611 | 0.7% |
| 0.03 | 5156 | 0.5% |
| 0.04 | 4135 | 0.4% |
| 0.05 | 3517 | 0.4% |
| 1.12 | 3142 | 0.3% |
| 1.19 | 3141 | 0.3% |
| 1.37 | 3127 | 0.3% |
| 1.14 | 3126 | 0.3% |
| Other values (1539) | 871635 |
| Value | Count | Frequency (%) |
| 0 | 58410 | |
| 0.01 | 11797 | 1.2% |
| 0.02 | 6611 | 0.7% |
| 0.03 | 5156 | 0.5% |
| 0.04 | 4135 | 0.4% |
| 0.05 | 3517 | 0.4% |
| 0.06 | 2958 | 0.3% |
| 0.07 | 2744 | 0.3% |
| 0.08 | 2387 | 0.2% |
| 0.09 | 2204 | 0.2% |
| Value | Count | Frequency (%) |
| 28.4 | 1 | |
| 27.22 | 1 | |
| 26.73 | 1 | |
| 26.03 | 1 | |
| 26.02 | 1 | |
| 25.86 | 1 | |
| 25.42 | 1 | |
| 25.01 | 1 | |
| 24.81 | 1 | |
| 24.59 | 1 |
dis
Real number (ℝ≥0)
| Distinct | 537 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.2649353921 |
| Minimum | 0 |
|---|---|
| Maximum | 8.9 |
| Zeros | 61876 |
| Zeros (%) | 6.4% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0.08 |
| median | 0.22 |
| Q3 | 0.39 |
| 95-th percentile | 0.68 |
| Maximum | 8.9 |
| Range | 8.9 |
| Interquartile range (IQR) | 0.31 |
Descriptive statistics
| Standard deviation | 0.2578158758 |
|---|---|
| Coefficient of variation (CV) | 0.9731273492 |
| Kurtosis | 51.68517049 |
| Mean | 0.2649353921 |
| Median Absolute Deviation (MAD) | 0.15 |
| Skewness | 4.260855707 |
| Sum | 257993.29 |
| Variance | 0.06646902581 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 61876 | 6.4% |
| 0.01 | 50588 | 5.2% |
| 0.02 | 29000 | 3.0% |
| 0.03 | 22494 | 2.3% |
| 0.04 | 19722 | 2.0% |
| 0.05 | 18535 | 1.9% |
| 0.2 | 17952 | 1.8% |
| 0.17 | 17891 | 1.8% |
| 0.18 | 17857 | 1.8% |
| 0.16 | 17851 | 1.8% |
| Other values (527) | 700031 |
| Value | Count | Frequency (%) |
| 0 | 61876 | |
| 0.01 | 50588 | |
| 0.02 | 29000 | |
| 0.03 | 22494 | 2.3% |
| 0.04 | 19722 | 2.0% |
| 0.05 | 18535 | 1.9% |
| 0.06 | 17634 | 1.8% |
| 0.07 | 17233 | 1.8% |
| 0.08 | 16904 | 1.7% |
| 0.09 | 16761 | 1.7% |
| Value | Count | Frequency (%) |
| 8.9 | 1 | |
| 7.59 | 1 | |
| 7.1 | 1 | |
| 6.83 | 1 | |
| 6.56 | 1 | |
| 6.52 | 1 | |
| 6.47 | 1 | |
| 6.28 | 1 | |
| 6.27 | 1 | |
| 6.25 | 1 |
o
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.9% |
| Missing | 42339 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 180.2660642 |
| Minimum | 0 |
|---|---|
| Maximum | 360 |
| Zeros | 8 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 31.85 |
| Q1 | 89.81 |
| median | 178.39 |
| Q3 | 270.31 |
| 95-th percentile | 329.92 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 180.5 |
Descriptive statistics
| Standard deviation | 99.30368715 |
|---|---|
| Coefficient of variation (CV) | 0.5508728867 |
| Kurtosis | -1.375388446 |
| Mean | 180.2660642 |
| Median Absolute Deviation (MAD) | 90.26 |
| Skewness | 0.006370240159 |
| Sum | 167910267.6 |
| Variance | 9861.222281 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 90 | 1992 | 0.2% |
| 266.51 | 101 | < 0.1% |
| 84.48 | 97 | < 0.1% |
| 266.9 | 97 | < 0.1% |
| 273.37 | 95 | < 0.1% |
| 274.54 | 94 | < 0.1% |
| 92.63 | 92 | < 0.1% |
| 80.04 | 92 | < 0.1% |
| 85.5 | 91 | < 0.1% |
| 270.34 | 91 | < 0.1% |
| Other values (35991) | 928616 | |
| (Missing) | 42339 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 8 | < 0.1% |
| 0.01 | 12 | |
| 0.02 | 10 | < 0.1% |
| 0.03 | 14 | |
| 0.04 | 16 | |
| 0.05 | 10 | < 0.1% |
| 0.06 | 7 | < 0.1% |
| 0.07 | 29 | |
| 0.08 | 21 | |
| 0.09 | 11 | < 0.1% |
| Value | Count | Frequency (%) |
| 360 | 10 | |
| 359.99 | 21 | |
| 359.98 | 20 | |
| 359.97 | 8 | < 0.1% |
| 359.96 | 18 | |
| 359.95 | 19 | |
| 359.94 | 16 | |
| 359.93 | 13 | |
| 359.92 | 12 | |
| 359.91 | 19 |
dir
Real number (ℝ≥0)
| Distinct | 36001 |
|---|---|
| Distinct (%) | 3.9% |
| Missing | 42339 |
| Missing (%) | 4.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 180.3159753 |
| Minimum | 0 |
|---|---|
| Maximum | 360 |
| Zeros | 24 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 7.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 24.01 |
| Q1 | 90.56 |
| median | 179.98 |
| Q3 | 270.48 |
| 95-th percentile | 336.32 |
| Maximum | 360 |
| Range | 360 |
| Interquartile range (IQR) | 179.92 |
Descriptive statistics
| Standard deviation | 101.0175078 |
|---|---|
| Coefficient of variation (CV) | 0.5602249477 |
| Kurtosis | -1.289176848 |
| Mean | 180.3159753 |
| Median Absolute Deviation (MAD) | 89.96 |
| Skewness | 2.518902752 × 10⁻⁵ |
| Sum | 167956757.7 |
| Variance | 10204.53689 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 95.24 | 65 | < 0.1% |
| 91.88 | 65 | < 0.1% |
| 88.24 | 65 | < 0.1% |
| 93.04 | 65 | < 0.1% |
| 96.17 | 65 | < 0.1% |
| 91.81 | 64 | < 0.1% |
| 268.47 | 64 | < 0.1% |
| 94.39 | 64 | < 0.1% |
| 268.37 | 64 | < 0.1% |
| 269.48 | 64 | < 0.1% |
| Other values (35991) | 930813 | |
| (Missing) | 42339 | 4.3% |
| Value | Count | Frequency (%) |
| 0 | 24 | |
| 0.01 | 16 | |
| 0.02 | 19 | |
| 0.03 | 28 | |
| 0.04 | 12 | |
| 0.05 | 16 | |
| 0.06 | 20 | |
| 0.07 | 11 | < 0.1% |
| 0.08 | 20 | |
| 0.09 | 17 |
| Value | Count | Frequency (%) |
| 360 | 8 | < 0.1% |
| 359.99 | 21 | |
| 359.98 | 26 | |
| 359.97 | 24 | |
| 359.96 | 11 | |
| 359.95 | 24 | |
| 359.94 | 14 | |
| 359.93 | 18 | |
| 359.92 | 17 | |
| 359.91 | 20 |
event
Categorical
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 7.4 MiB |
| None | |
|---|---|
| ball_snap | 23046 |
| pass_forward | 20585 |
| autoevent_passforward | 10488 |
| autoevent_ballsnap | 10327 |
| Other values (15) | 11707 |
Length
| Max length | 25 |
|---|---|
| Median length | 4 |
| Mean length | 4.699426061 |
| Min length | 3 |
Characters and Unicode
| Total characters | 4576287 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | None |
|---|---|
| 2nd row | None |
| 3rd row | None |
| 4th row | None |
| 5th row | None |
Common Values
| Value | Count | Frequency (%) |
| None | 897644 | |
| ball_snap | 23046 | 2.4% |
| pass_forward | 20585 | 2.1% |
| autoevent_passforward | 10488 | 1.1% |
| autoevent_ballsnap | 10327 | 1.1% |
| play_action | 5750 | 0.6% |
| run | 1173 | 0.1% |
| qb_sack | 1104 | 0.1% |
| pass_arrived | 897 | 0.1% |
| autoevent_passinterrupted | 667 | 0.1% |
| Other values (10) | 2116 | 0.2% |
Length
| Value | Count | Frequency (%) |
| none | 897644 | |
| ball_snap | 23046 | 2.4% |
| pass_forward | 20585 | 2.1% |
| autoevent_passforward | 10488 | 1.1% |
| autoevent_ballsnap | 10327 | 1.1% |
| play_action | 5750 | 0.6% |
| run | 1173 | 0.1% |
| qb_sack | 1104 | 0.1% |
| pass_arrived | 897 | 0.1% |
| autoevent_passinterrupted | 667 | 0.1% |
| Other values (10) | 2116 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 961906 | |
| o | 957559 | |
| e | 944518 | |
| N | 897644 | |
| a | 166727 | 3.6% |
| s | 102097 | 2.2% |
| _ | 75164 | 1.6% |
| p | 73830 | 1.6% |
| l | 73048 | 1.6% |
| r | 66815 | 1.5% |
| Other values (15) | 256979 | 5.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3603479 | |
| Uppercase Letter | 897644 | 19.6% |
| Connector Punctuation | 75164 | 1.6% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| n | 961906 | |
| o | 957559 | |
| e | 944518 | |
| a | 166727 | 4.6% |
| s | 102097 | 2.8% |
| p | 73830 | 2.0% |
| l | 73048 | 2.0% |
| r | 66815 | 1.9% |
| t | 52739 | 1.5% |
| b | 34661 | 1.0% |
| Other values (13) | 169579 | 4.7% |
Uppercase Letter
| Value | Count | Frequency (%) |
| N | 897644 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 75164 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 4501123 | |
| Common | 75164 | 1.6% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| n | 961906 | |
| o | 957559 | |
| e | 944518 | |
| N | 897644 | |
| a | 166727 | 3.7% |
| s | 102097 | 2.3% |
| p | 73830 | 1.6% |
| l | 73048 | 1.6% |
| r | 66815 | 1.5% |
| t | 52739 | 1.2% |
| Other values (14) | 204240 | 4.5% |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| _ | 75164 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 4576287 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| n | 961906 | 21.0% |
| o | 957559 | 20.9% |
| e | 944518 | 20.6% |
| N | 897644 | 19.6% |
| a | 166727 | 3.6% |
| s | 102097 | 2.2% |
| _ | 75164 | 1.6% |
| p | 73830 | 1.6% |
| l | 73048 | 1.6% |
| r | 66815 | 1.5% |
| Other values (15) | 256979 | 5.6% |
Auto
The auto setting is an easily interpretable pairwise column metric that selects a method by variable type: categorical-categorical pairs use Cramér's V, numerical-categorical pairs use Cramér's V (on a discretized numerical column), and numerical-numerical pairs use Spearman's ρ. This configuration uses the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
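The calculation described for ρ can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the code the report runs: `average_ranks` and `spearman_rho` are hypothetical helper names, ties receive the mean of their rank positions, and the data are made up to show a monotonic but nonlinear relationship.

```python
import numpy as np

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    i = 0
    while i < len(values):
        j = i
        # Extend j over the run of equal values that starts at sorted position i.
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Covariance of the rank variables over the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    cov = np.mean((rx - rx.mean()) * (ry - ry.mean()))
    return cov / (rx.std() * ry.std())

# Made-up data: y = x**3 is monotonic in x but not linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because only the ordering of the values matters, ρ for this example is 1 up to floating-point rounding, whereas Pearson's r on the raw values would be below 1.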
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
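As a rough illustration of the formula and of the invariance property, the sketch below computes r directly from the covariance and standard deviations, then recomputes it after shifting and rescaling each variable separately. The function name `pearson_r` and the sample arrays are assumptions made for this example, and the scale factors are chosen positive, since a negative factor would flip the sign of r.

```python
import numpy as np

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return cov / (x.std() * y.std())

# Made-up sample data for illustration only.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

r_original = pearson_r(x, y)

# Change location and scale of each variable separately (positive factors).
r_transformed = pearson_r(10.0 + 2.5 * x, -3.0 + 0.1 * y)

print(r_original, r_transformed)
```

Both calls give the same coefficient up to rounding, and the hand-rolled value agrees with `np.corrcoef(x, y)[0, 1]`.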
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
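The pair-counting definition translates directly into code. The sketch below implements the formula as stated, which is the variant usually called tau-a; `kendall_tau_a` is a name chosen here, the data are made up, and the quadratic loop is only meant for small inputs. Library routines such as `scipy.stats.kendalltau` default to tau-b, which additionally adjusts for ties, so results can differ on tied data.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1   # both variables move in the same direction
        elif product < 0:
            discordant += 1   # the variables move in opposite directions
        # product == 0 means a tie in x or y; it counts toward neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Made-up data: one of the six pairs is out of order.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five pairs are concordant and one is discordant, so τ = (5 - 1) / 6.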
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021101400 | 76 | 25511.0 | 1 | 2021-10-15T00:23:39.200 | 12.0 | TB | right | 33.99 | 23.88 | 0.0 | 0.0 | 0.00 | 91.73 | 67.89 | None |
| 1 | 2021101400 | 76 | 25511.0 | 2 | 2021-10-15T00:23:39.300 | 12.0 | TB | right | 33.99 | 23.88 | 0.0 | 0.0 | 0.00 | 91.73 | 57.67 | None |
| 2 | 2021101400 | 76 | 25511.0 | 3 | 2021-10-15T00:23:39.400 | 12.0 | TB | right | 34.00 | 23.89 | 0.0 | 0.0 | 0.01 | 91.73 | 49.03 | None |
| 3 | 2021101400 | 76 | 25511.0 | 4 | 2021-10-15T00:23:39.500 | 12.0 | TB | right | 34.00 | 23.89 | 0.0 | 0.0 | 0.00 | 91.73 | 47.57 | None |
| 4 | 2021101400 | 76 | 25511.0 | 5 | 2021-10-15T00:23:39.600 | 12.0 | TB | right | 34.00 | 23.89 | 0.0 | 0.0 | 0.00 | 91.73 | 50.68 | None |
| 5 | 2021101400 | 76 | 25511.0 | 6 | 2021-10-15T00:23:39.700 | 12.0 | TB | right | 34.00 | 23.89 | 0.0 | 0.0 | 0.00 | 91.02 | 46.28 | ball_snap |
| 6 | 2021101400 | 76 | 25511.0 | 7 | 2021-10-15T00:23:39.800 | 12.0 | TB | right | 33.99 | 23.89 | 0.0 | 0.0 | 0.00 | 91.02 | 34.90 | autoevent_ballsnap |
| 7 | 2021101400 | 76 | 25511.0 | 8 | 2021-10-15T00:23:39.900 | 12.0 | TB | right | 33.99 | 23.89 | 0.0 | 0.0 | 0.01 | 91.02 | 2.17 | None |
| 8 | 2021101400 | 76 | 25511.0 | 9 | 2021-10-15T00:23:40.000 | 12.0 | TB | right | 33.98 | 23.89 | 0.0 | 0.0 | 0.01 | 91.02 | 320.86 | None |
| 9 | 2021101400 | 76 | 25511.0 | 10 | 2021-10-15T00:23:40.100 | 12.0 | TB | right | 33.97 | 23.89 | 0.0 | 0.0 | 0.01 | 91.02 | 297.69 | None |
Last rows
| | gameId | playId | nflId | frameId | time | jerseyNumber | team | playDirection | x | y | s | a | dis | o | dir | event |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 973787 | 2021101800 | 3998 | NaN | 33 | 2021-10-19T03:18:54.200 | NaN | football | right | 94.21 | 23.25 | 0.69 | 0.99 | 0.16 | NaN | NaN | None |
| 973788 | 2021101800 | 3998 | NaN | 34 | 2021-10-19T03:18:54.300 | NaN | football | right | 94.45 | 23.32 | 0.73 | 0.66 | 0.25 | NaN | NaN | None |
| 973789 | 2021101800 | 3998 | NaN | 35 | 2021-10-19T03:18:54.400 | NaN | football | right | 94.59 | 23.33 | 0.77 | 0.70 | 0.13 | NaN | NaN | None |
| 973790 | 2021101800 | 3998 | NaN | 36 | 2021-10-19T03:18:54.500 | NaN | football | right | 94.67 | 23.32 | 0.77 | 0.71 | 0.09 | NaN | NaN | None |
| 973791 | 2021101800 | 3998 | NaN | 37 | 2021-10-19T03:18:54.600 | NaN | football | right | 94.75 | 23.32 | 0.77 | 0.69 | 0.08 | NaN | NaN | run |
| 973792 | 2021101800 | 3998 | NaN | 38 | 2021-10-19T03:18:54.700 | NaN | football | right | 94.87 | 23.35 | 0.82 | 0.78 | 0.12 | NaN | NaN | None |
| 973793 | 2021101800 | 3998 | NaN | 39 | 2021-10-19T03:18:54.800 | NaN | football | right | 95.03 | 23.42 | 1.05 | 1.09 | 0.18 | NaN | NaN | None |
| 973794 | 2021101800 | 3998 | NaN | 40 | 2021-10-19T03:18:54.900 | NaN | football | right | 95.42 | 23.65 | 1.91 | 2.38 | 0.45 | NaN | NaN | None |
| 973795 | 2021101800 | 3998 | NaN | 41 | 2021-10-19T03:18:55.000 | NaN | football | right | 95.79 | 23.86 | 2.65 | 3.68 | 0.43 | NaN | NaN | None |
| 973796 | 2021101800 | 3998 | NaN | 42 | 2021-10-19T03:18:55.100 | NaN | football | right | 96.24 | 24.13 | 3.63 | 0.44 | 0.52 | NaN | NaN | None |